An increasing incidence of enteric disorders clinically suggestive of the poult enteritis complex has been 
observed in turkeys in France since 2003. Using a newly designed real-time reverse transcriptase-polymerase 
chain reaction assay specific for the nucleocapsid (N) gene of infectious bronchitis virus (IBV) and turkey 
coronaviruses (TCoV), coronaviruses were identified in 37% of the intestinal samples collected from diseased 
turkey flocks. The full-length spike (S) gene of these viruses was amplified, cloned and sequenced from three 
samples. The French S sequences shared 98% identity at both the nucleotide and amino acid levels, whereas 
they were at most 65% and 60% identical with North American (NA) TCoV and at most 50% and 37% identical 
with IBV at the nucleotide and amino acid levels, respectively. Higher divergence with NA TCoV was observed 
in the S1-encoding domain. Phylogenetic analysis based on the S gene revealed that the newly detected viruses 
form a sublineage genetically related with, but significantly different from, NA TCoV. Additionally, the RNA- 
dependent RNA polymerase gene and the N gene, located on the 5’ and 3’ sides of the S gene in the coronavirus 
genome, were partially sequenced. Phylogenetic analysis revealed that both the NA TCoV and French TCoV 
(Fr TCoV) lineages included some IBV relatives, which were however different in the two lineages. This 
suggested that different recombination events could have played a role in the evolution of the NA and Fr 
TCoV. The present results provide the first S sequence for a European TCoV. They reveal extensive genetic 
variation in TCoV and suggest different evolutionary pathways in North America and Europe. 


Introduction 


Coronaviruses (CoVs) consist of large, enveloped and 
positive-stranded RNA viruses within the order Nidovir- 
ales. They possess an approximately 30-kb-long genome 
from which they transcribe a set of multiple 3’-coterminal 
nested subgenomic mRNAs (Masters, 2006). The virions 
are pleomorphic enveloped particles, roughly spherical, 
with diameters ranging from 50 to 200 nm. They possess 
long petal-shaped spikes on their membrane that are 
responsible for the CoV typical crown-shaped morphol- 
ogy in electron microscopy. CoVs infect a wide variety of 
avian and mammalian species and cause primarily 
respiratory or enteric diseases, but also in some cases 
neurologic illness or hepatitis. In humans, the outbreak of 
severe acute respiratory syndrome (SARS) caused in 
2003 by a CoV has led to an increased interest in family 
Coronaviridae and its potential animal reservoirs. Based 
on the latest proposals to the International Committee 
on the Taxonomy of Viruses, family Coronaviridae in the 
Nidovirales order includes two sub-families, Coronavir- 
inae and Torovirinae, the former comprising three genera, 
Alphacoronaviruses, Betacoronaviruses and 
Gammacoronaviruses (de Groot et al., 2008). 


The proposed Gammacoronavirus genus groups mostly 
coronaviruses isolated from birds, with the exception of 
the SW1 and ALC/GX/230/06 viruses isolated from a 
beluga whale (Mihindukulasuriya et al., 2008), and from 
the Asian leopard cat or Chinese ferret badger (Dong 
et al., 2007), respectively. A first proposed species, avian 
coronavirus (AvCoV), corresponds to the former “sub- 
group 3a within the coronavirus genus”. It groups 
isolates obtained from Galliformes (chicken, turkey, 
pheasant, peafowl, quail), Columbiformes (pigeon), Ana- 
tidae (teal, goose, duck, swan), Charadriiformes (red 
knot and oystercatcher) and possibly Psittaciformes 
(Cavanagh et al., 2002; Jonassen et al., 2005; Liu et al., 
2005; Gough et al., 2006; Qian et al., 2006; Circella et al., 
2007; Hughes et al., 2009). The second proposed species 
within the Gammacoronavirus genus is Beluga whale 
coronavirus SW1, and corresponds to the former 3b 
subgroup (Mihindukulasuriya et al., 2008). A third 
group of isolates (former 3c subgroup) contains viruses 
obtained from passerines (bulbul, munia, and thrush) 
(Woo et al., 2009) and from the Asian leopard cat and 
Chinese ferret badger (Dong et al., 2007; Woo et al., 
2009); however, these isolates have not yet been assigned 
to any species within the Gammacoronavirus genus. 

The most economically significant viruses among 
AvCoVs are infectious bronchitis viruses (IBV) and 
turkey coronaviruses (TCoV), affecting the poultry 
industry. IBV causes avian infectious bronchitis, a 
common, highly contagious, and acute viral respiratory 
and genital disease of domestic fowl with worldwide 
distribution and major economic consequences 
(Cavanagh & Gelb, 2008). TCoV was shown in the 
1970s to be one of the causative agents of an enteric 
disease known as bluecomb, transmissible enteritis or 
coronaviral enteritis of turkeys (Guy, 2008). More 
recently, TCoV has been associated with poult enteritis 
complex (PEC), which groups several infectious intest- 
inal disorders of young turkeys up to 7 weeks of age. 
Clinical signs include diarrhoea, stunting, dehydration, 
anorexia, weight loss and immune dysfunction. When 
associated with mortality, this disease is designated as 
poult enteritis and mortality syndrome (PEMS) (Barnes 
et al., 2000). 

CoVs share a common genomic organization that 
consists of two open reading frames (ORFs) in the 5’- 
end that encode a single viral replicase and a varying 
number of genes in the 3’ end that encode, among other 
products, the four major structural proteins (spike S, 
envelope E, membrane M and nucleocapsid N). The 
viral replicase is cleaved by proteases leading to the 
release of 15 or 16 proteins, among which is the ORF1b- 
encoded RNA-dependent RNA polymerase (RdRp). 
Among the four major structural proteins, N is a 
phosphoprotein of approximately 50 kDa that binds 
the genomic RNA. It is an immunogenic determinant for 
humoral and cellular immunity (Collisson et al., 2000; 
Tang et al., 2008). The S glycoprotein forms the large, 
club-shaped projections on the external surface of the 
virion envelope. S contains two subdomains S1 and S2 
that are, in most Betacoronaviruses and Gammacorona- 
viruses, cleaved by a trypsin-like host protease (Masters, 
2006). The S2 subdomain anchors the spike into the 
virus membrane whereas the S1 subdomain forms the 
extracellular globular portion of the spike and supports 
the host receptor-binding activity (Kubo et al., 1994; 
Krempl et al., 2000; Wong et al., 2004). In addition, the 
S1 glycoprotein contains major epitopes that induce 
neutralizing antibodies (Cavanagh et al., 1988). 


So far, only North American strains of TCoV (NA 
TCoV) have been extensively sequenced and character- 
ized at the molecular level (Lin et al., 2004; Cao et al., 
2008; Gomaa et al., 2008; Jackwood et al., 2010). These 
data clearly showed sequence homogeneity between the 
studied strains with at least 92.4% nucleotide identity for 
the full-length genomes and 91% amino acid identity for 
the S proteins. The percentage identity between TCoV 
and IBV isolates was also high for the full-length genome 
(higher than 86%), indicating a close genetic relationship 
between these two viruses; however, the S proteins of the 
compared TCoV and IBV shared at most 36% amino 
acid identity. This finding led to the hypothesis that 
TCoV might have emerged from IBV through recombi- 
nation (Jackwood et al., 2010). In Europe, only short 
nucleotide sequences corresponding to the 3’ end of the 
genome of an IBV-related CoV have been detected from 
turkeys in the UK (Cavanagh et al., 2001), in Italy 
(Moreno Martin et al., 2002), and in Poland 
(Domanska-Blicharz et al., 2010). However, since 2003 an 
increasing number of turkey flocks exhibiting clinical 
signs compatible with PEC have been observed in France 
(Germain & Rousseau, 2005). In this paper we report on 
the development of a Taqman® quantitative reverse 
transcriptase-polymerase chain reaction (RT-qPCR) sui- 
table for the detection of IBV and TCoV, on the 
detection in France of CoV-positive turkey flocks with 
signs suggestive of PEC, and on the sequencing of the 
entire hypervariable S gene, together with fragments of 
the N and RdRp genes of these strains. 


Materials and Methods 


Viruses and clinical samples. The IBV and TCoV strains used in the 
validation of the RT-qPCR are listed in Table 1. Twenty-one viruses not 
belonging to the Gammacoronavirus genus but representing avian 
metapneumoviruses, paramyxoviruses and orthomyxoviruses, 
adenoviruses, reoviruses, avibirnaviruses, rotavirus and astroviruses were also 
tested (Guionie et al., 2007). Clinical samples (duodenum, jejunum, 
caecum, kidney, spleen, liver, bursa of Fabricius and/or thymus) were 
received for virological investigation from 81 turkey flocks collected in 
western France and presenting with signs suggestive of PEC between 
April 2007 and October 2009. Twenty-three digestive contents collected 
before 1988 from meat turkey flocks with enteric disorders (Andral 
et al., 1985) were also investigated. The pre-1988 samples had been kept 
frozen at -20°C since harvest. All samples were ground and suspended 
w/v in phosphate-buffered saline. The suspensions were centrifuged at 
3000 × g for 15 min and the supernatants were stored at -70°C prior to 
RNA extraction. 


Table 1. Strains used for testing the specificity of the AvCoV RT-qPCR. 

Virus                              Origin        Reference in authors’ laboratory   Reference or source 
IBV strain 84084                   France        PL 84084/5.5                       Picault et al. (1986) 
IBV strain 84221                   France        CR 84221/6.1                       Picault (1987) 
IBV strain 84222                   France        CR 84222/6.1                       Picault (1987) 
IBV strain 85131                   France        CR 85131/13.1                      Picault (1987) 
IBV strain 88061                   France        CR 88061/8.3                       Picault et al. (1988) 
IBV strain 88121                   France        CR 88121/11.4                      Picault et al. (1988) 
IBV vaccine (Massachusetts type)   -             CR B48/1.1                         Cevac® Mass L (CEVA Santé animale, France) 
IBV strain 4/91                    UK            CR 4-91/3.1                        Parsons et al. (1992) 
IBV vaccine (D274b type)           Netherlands   CR D274b/6.3                       Davelaar et al. (1984) 
IBV strain D212                    Netherlands   CR D212/56                         Davelaar et al. (1984) 
IBV strain D3128                   Netherlands   CR D3128/64                        Davelaar et al. (1984) 
IBV strain D3896                   Netherlands   CR D3896/52                        Davelaar et al. (1984) 
IBV strain Connecticut             USA           CR Conn/5.1                        Jungherr et al. (1956) 
IBV strain Beaudette               USA           VBBPa/3.8                          Picault (1987) 
TCoV                               USA           TCoV INP4/1.1                      Kind gift from Prof. C.C. Wu and Prof. Y.M. Saif 


RNA extraction. RNA was extracted from 140 µl viral suspension using 
the QIAamp® Viral RNA Mini Kit (Qiagen, Hilden, Germany), 
according to the manufacturer’s instructions. The purified RNA was 
eluted in 60 µl AVE buffer and stored at -70°C prior to RT-qPCR. 


Development and validation of the RT-qPCR. To easily screen the field 
samples and avoid a lack of amplification of some European viruses due 
to their possible genetic variation, it was decided not to rely on the 
methods previously developed for TCoV but to develop a broad- 
spectrum RT-qPCR specific for both IBV and TCoV. The oligonucleo- 
tide primers and Taqman probe were designed with the Primer 
Express® software (Applied Biosystems, Warrington, UK) from a 
conserved region in the middle of the N gene. The primers and probe 
were screened with the NCBI BLAST program to check for the lack of 
cross-reaction with previously released cellular, viral or bacterial 
sequences. The forward primer Ncor800f, reverse primer Ncor860r 
and the Taqman® dual-labelled probe Ncor830p (Table 2) were 
synthesized by Applied Biosystems and amplified a 66-bp cDNA. The 
one-step Taqman® AvCoV RT-qPCR assay was run using the 
Quantitect™ probe RT-PCR Kit (Qiagen). Briefly, each mix contained 
1× (12.5 µl) Quantitect™ RT-PCR master mix, 0.25 µl Quantitect™ RT 
mix, 300 nM each primer, 100 nM probe, 5 µl template RNA and 
RNase-free water to a final volume of 25 µl. RT-qPCR was performed 
using a Taqman 7000 thermocycler (Applied Biosystems) under the 
following cycling conditions: 48°C for 30 min (reverse transcription), 
95°C for 10 min (RT inactivation and activation of the HotStartTaq 
DNA polymerase); and 40 cycles combining 95°C for 15 sec (denatura- 
tion) and 60°C for 1 min (annealing, extension step, and fluorescence 
data collection). Data were analysed with the SDS software, version 
1.3.1 (Applied Biosystems). 

To assess the analytical sensitivity of detection, a fragment of the N 
gene encompassing the region amplified by the newly developed RT-qPCR 
assay was amplified with the Ncor1f and Ncor860r primers 
(Table 2), then was cloned with the pcDNA 3.1™ Directional TOPO® 
Expression Kit (Invitrogen, Carlsbad, California, USA) according to 
the manufacturer’s procedure. The resulting plasmid was linearized at 
the EcoRV (New England Biolabs, Hitchin, UK) restriction site located 
downstream of the insert. Runoff transcripts were synthesized using the 
T7 RNA polymerase (Promega, Madison, Wisconsin, USA) according 
to the standard protocol. Following the transcription reaction, the 
DNA templates were removed by digestion with the RQ1 RNase-free 
DNase (Promega). The absence of residual contaminating DNA was 
confirmed by a negative result in a qPCR assay (RT-qPCR without 
the reverse transcriptase enzyme). The in vitro transcripts were extracted 
with phenol/chloroform, resuspended in nuclease-free water, aliquoted 
and stored at -70°C. They were quantified by measuring the A260 with a 
spectrophotometer to calculate the number of copies. To evaluate the 
amplification efficiency, the RT-qPCR assay was tested using 10-fold 
dilution series from 5 x 10” to 5x 10° copies per reaction of RNA 
runoff transcripts. 
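
The copy-number calculation described above can be sketched as follows. This is a minimal illustration and not the authors' own calculation: the conversion factor of 40 µg/ml per A260 unit and the average mass of 340 g/mol per nucleotide are conventional approximations for single-stranded RNA, and the transcript length passed to the function is a hypothetical value because the source does not state it.

```python
AVOGADRO = 6.022e23              # molecules per mole
NT_MASS_G_PER_MOL = 340.0        # assumed average mass of one ssRNA nucleotide
RNA_UG_PER_ML_PER_A260 = 40.0    # assumed conversion factor for ssRNA


def rna_copies_per_ul(a260, transcript_length_nt, dilution_factor=1.0):
    """Estimate transcript copies per microlitre from an A260 reading."""
    conc_ug_per_ml = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor
    conc_g_per_ul = conc_ug_per_ml * 1e-9          # 1 ug/ml equals 1e-9 g/ul
    mol_per_ul = conc_g_per_ul / (transcript_length_nt * NT_MASS_G_PER_MOL)
    return mol_per_ul * AVOGADRO


def tenfold_series(start_copies, n_points):
    """Copy numbers of a 10-fold dilution series, highest first."""
    return [start_copies / 10 ** i for i in range(n_points)]


if __name__ == "__main__":
    # Hypothetical reading: A260 = 1.0 for a 1000-nt transcript.
    print(f"{rna_copies_per_ul(1.0, 1000):.3e} copies/ul")
    print(tenfold_series(5e6, 3))
```

The stock concentration obtained this way would then be diluted to the copy numbers per reaction used for the standard curve.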

Full-length amplification and sequencing of the S gene. Some field 
samples with a positive RT-qPCR result were used for the full-length 
amplification and sequencing of the S gene. To amplify the full-length S 
gene, RT-PCR was performed using primers defined in conserved 
regions flanking the S gene, namely primer Scor+260f in the 3’ end of 
gene 1b and primer Scor-150r in the 5’ end of gene 3a (Table 2). RNA 
was reverse transcribed at 42°C for 75 min using Superscript II reverse 
transcriptase (Invitrogen) and primer Scor-150r. Four microlitres 
of cDNA were amplified with the primer pair Scor+260f/Scor-150r 
(Table 2). PCR was performed with the Expand High Fidelity PCR 
kit (Roche, Mannheim, Germany) according to the manufacturer’s 
instructions. The PCR products were purified using the Gene Elute kit 
(Qiagen) and cloned into the pGEM®-T easy vector (Promega) 
according to the manufacturer’s instructions. At least three positive 
clones for each sample were sequenced in both directions. The primers 
defined in the newly determined sequence were used according to a 
gene-walking strategy (primers available upon request). PCR and 
sequencing were performed with the Big Dye® terminator cycle kit 
and the Genetic Analyzer 3130 system according to the manufacturer’s 
recommendations (Applied Biosystems). 


Partial amplification and sequencing of the N and RdRp genes. Some 
field samples with a positive RT-qPCR result were also used for the 
partial amplification and sequencing of both the N and RdRp genes. 
RNA was reverse transcribed at 42°C for 75 min using Superscript II 
reverse transcriptase (Invitrogen) and primers Ncor860r or 
POLcor2590r for the N or RdRp genes, respectively. Four microlitres of 
cDNA were amplified with primer pairs Ncor1f/Ncor860r or 
POLcor1900f/POLcor2590r (Table 2) for the N or RdRp genes, respectively. 
PCR was performed with the Expand High Fidelity PCR kit (Roche) 
according to the manufacturer’s instructions. Amplicons were purified 
after agarose gel electrophoresis using the Gene Elute kit (Qiagen). The 
purified amplicons were sequenced as described for the S gene, using the 
PCR primers. 


Similarity searches, phylogenetic analysis and protein motif predictions. 
The nucleotide-nucleotide or protein-protein BLAST search analyses 
to define the closest IBV or NA TCoV relatives of the newly determined 
sequences were performed online at the National Center for 
Biotechnology Information (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Percentage 
nucleotide identities in pairwise alignments were determined with the 
MEGA software (Kumar et al., 2008). For phylogenetic analyses, 
coronavirus sequences were retrieved from databanks that were closely 
related to TCoV, as detected by BLAST, or were representative of IBV 
isolates from the USA, Europe or China and had their S, N and RdRp 
genes sequenced, or were representative of other virus species within the 
Gammacoronavirus genus (Table 3). The nucleotide sequences were first 
translated and the deduced protein sequences were aligned using the 
BLOSUM matrix. The alignment of nucleotide sequences was then 
deduced from the alignment of protein sequences. Phylogenetic trees 
were finally inferred using the neighbour-joining method (with the 
Kimura two-parameter model) in the MEGA software, and with the 
maximum likelihood method in the PhyML software (Guindon & 
Gascuel, 2003). All methods were implemented with bootstrap on 1000 
replicates, and the SARS-CoV (Betacoronavirus) sequence was used as an 
outgroup for the maximum likelihood method. 
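
As an illustration of the two quantities used above, the sketch below computes the percentage identity and the Kimura two-parameter distance for one pair of aligned nucleotide sequences. It is a simplified stand-in for what MEGA computes, not its implementation: columns containing gaps or ambiguous bases are simply skipped (an assumption), and the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_aligned(seq1, seq2):
    """Return (identity, transition fraction P, transversion fraction Q)
    over the alignment columns where both sequences carry A, C, G or T."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    n = same = transitions = transversions = 0
    for a, b in zip(s1, s2):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        n += 1
        if a == b:
            same += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable columns")
    return same / n, transitions / n, transversions / n


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance: -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    _, p, q = compare_aligned(seq1, seq2)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    a, b = "AAAAAAAAAA", "AAAAAAAAAG"  # toy sequences, one transition
    print(f"identity = {100 * compare_aligned(a, b)[0]:.0f}%")
    print(f"K2P distance = {k2p_distance(a, b):.4f}")
```

A neighbour-joining tree is then built from the matrix of such pairwise distances; that step is left to dedicated software.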


Table 2. Sequence and position of the oligonucleotides used in this study. 

Name (a)      Sequence (5’ > 3’) (b)      Position in TCoV genome (c)   Use 
Ncor800f      CGTGTTACGGCAATGCTCAA        26,852 to 26,871              RT-qPCR 
Ncor860r      CGTCACTCTGCTTCCAAAAAGAC     26,917 to 26,895              RT-qPCR and RT/PCR for sequencing N gene 
Ncor830p      fam-CCCTAGCAGCCATGC-mgb     26,878 to 26,892              RT-qPCR probe 
Ncor1f        CACCATGGCAAGCGGTAAGGC       26,050 to 26,070              PCR for sequencing N gene 
Scor+260f     GAATGCGTCTTCTTCAGAAGC       20,096 to 20,116              PCR for cloning S gene 
Scor-150r     TGTGCCAAAGCAGAAGTCTAG       24,150 to 24,130              RT/PCR for cloning S gene 
POLcor1900f   AAGTGTGATAGAGCAATGCC        14,196 to 14,215              PCR for sequencing RdRp gene 
POLcor2590r   CTCCATAACAGCCACAGG          14,906 to 14,889              RT/PCR for sequencing RdRp gene 

(a) f, forward primers; r, reverse primers; p, probe. (b) fam, 6-carboxyfluorescein; mgb, minor groove binder. (c) Relative to the genome of 
TCoV (NC_010800). 
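
The expected amplicon sizes follow directly from the primer coordinates in Table 2. The short sketch below is only an illustrative check: it takes the 5’ position of each forward and reverse primer and assumes the product spans both primers inclusively. The dictionary labels are hypothetical names chosen for readability.

```python
# 5' genome coordinates (forward primer, reverse primer) copied from Table 2,
# relative to the TCoV reference genome NC_010800.
PRIMER_PAIRS = {
    "RT-qPCR (Ncor800f/Ncor860r)": (26852, 26917),
    "N gene (Ncor1f/Ncor860r)": (26050, 26917),
    "S gene (Scor+260f/Scor-150r)": (20096, 24150),
    "RdRp gene (POLcor1900f/POLcor2590r)": (14196, 14906),
}


def amplicon_length(forward_5p, reverse_5p):
    """Length in bp of a product spanning both primers inclusively."""
    return reverse_5p - forward_5p + 1


AMPLICON_SIZES = {
    name: amplicon_length(fwd, rev) for name, (fwd, rev) in PRIMER_PAIRS.items()
}

if __name__ == "__main__":
    for name, size in AMPLICON_SIZES.items():
        print(f"{name}: {size} bp")
```

The RT-qPCR pair gives 66 bp, in agreement with the amplicon size stated in the text; the sizes for the other pairs apply to the reference genome only and may differ in field strains.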

Table 3. Virus sequences included in the phylogenetic study. 

Virus                        Origin      Accession number (a)          Reference 
QCoV Italy/Elvia/2005        Italy       EF446155/na (b)/na            Circella et al. (2007) 
IBV strain 4/91              UK          AF093794/EU780081/FN811147    Callisson et al. (2001), Meir et al. (2010), present study 
IBV strain BJ                China       AY319651                      Jin et al., unpublished 
IBV strain ITA/90254/2005    (Italy?)    FN430414                      Ducatez et al. (2009) 
IBV strain Ark DPI101        USA         EU418975                      Ammayapan et al. (2009) 
IBV strain California99      USA         AY514485                      Mondal & Cardona (2007) 
IBV strain CK/CH/LSD/05I     China       EU637854                      Wang et al., unpublished 
IBV strain M41               USA         DQ834384                      Mondal et al., unpublished 
TCoV strain 540              USA         EU022525                      Cao et al. (2008) 
TCoV strain IN517/94         USA         GQ427175                      Jackwood et al. (2010) 
TCoV strain MG10             USA         EU095850                      Gomaa et al. (2008) 
TCoV strain VA74/03          USA         GQ427173                      Jackwood et al. (2010) 
TCoV strain GI               USA         AY342357                      Jackwood & Hilt, unpublished 
TCoV strain Gh               USA         AY342356                      Jackwood & Hilt, unpublished 
TCoV strain TX-1038/98       USA         GQ427176                      Jackwood et al. (2010) 
TCoV strain ATCC             USA         EU022526                      Cao et al. (2008) 
TCoV strain NC95             USA         na/AF111997/na                Breslin et al. (1999) 
CoV SW1                      -           EU111742                      Mihindukulasuriya et al. (2008) 
ThCoV strain HKU12/600       Hong Kong   FJ376621                      Woo et al. (2009) 
BuCoV strain HKU11/934       Hong Kong   FJ376619                      Woo et al. (2009) 
MuCoV strain HKU13/3514      Hong Kong   FJ376622                      Woo et al. (2009) 
AlcCoV-Guangxi/F230/2006     China       EF584908                      Dong et al. (2007) 

(a) A single accession number (whole genome) or three accession numbers (in the S/N/RdRp order) are indicated. (b) na, sequence not 
available. 


Concerning the motif predictions from the amino acid sequence of 
the S protein, we used the ProP server (http://www.cbs.dtu.dk/services/ProP/) 
to detect putative peptide cleavage sites, the TopPred website 
(http://mobyle.pasteur.fr/cgi-bin/portal.py?form=toppred) to predict 
hydrophobicity profiles, and the COILS program 
(http://www.ch.embnet.org/software/COILS_form.html) to deduce heptad repeat regions 
indicative of a coiled-coil structure. 


Accession numbers for the reported sequences. The gene sequences for 
FR070341j, FR080147c, FR080183j and IBV-4/91 were submitted to 
the EMBL database and have been assigned accession numbers: full 
spike genes of FR070341j, FR080147c and FR080183j are under 
accession numbers GQ411201, FN434203, FN545819, respectively. 
The accession numbers for the partial N gene sequences of 
FR070341j, FR080147c and FR080183j are FN665664 to FN665666, 
respectively. The accession numbers for the partial RdRp gene 
sequences of FR070341j, FR080147c, FR080183j and IBV-4/91 are 
FN811144 to FN811147, respectively. 


Results 


Specificity and sensitivity of the RT-qPCR. Primers 
Ncor800f and Ncor860r and probe Ncor830p efficiently 
detected the genome of both TCoV-INP4 and the 14 
strains of IBV tested in this study (Table 1). All of the 
non-CoV controls (Guionie et al., 2007) produced a 
cycle threshold (Ct) higher than 35. 

A linear standard curve of template copy number 
against Ct value was established using serial 10-fold 
dilutions of in vitro transcripts. The detection range was 
between 107° and 10° copies/µl, with a slope of 
-3.48 ± 0.09 indicating an amplification efficiency of 
94.0 ± 3.3% and a coefficient of correlation (R²) of 
0.999 ± 0.001 for 18 independent repeated runs (data 
not shown). The linear (quantification) range was 
between 10°° and 10° copies/µl. 
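
The link between the slope of the standard curve and the amplification efficiency can be made explicit with the usual relation E = 10^(-1/slope) - 1. The sketch below is an illustration under that standard assumption, not the SDS software's own routine; the function names and the synthetic Ct values are hypothetical.

```python
def amplification_efficiency(slope):
    """Efficiency (1.0 = 100%) from the slope of Ct against log10(copies)."""
    return 10 ** (-1.0 / slope) - 1.0


def fit_standard_curve(log10_copies, ct_values):
    """Ordinary least-squares fit of Ct against log10(copy number).

    Returns (slope, intercept, r_squared).
    """
    n = len(log10_copies)
    if n != len(ct_values) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(log10_copies, ct_values))
    ss_tot = sum((y - mean_y) ** 2 for y in ct_values)
    return slope, intercept, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic, perfectly linear dilution series with the reported mean slope.
    xs = [1, 2, 3, 4, 5, 6]
    cts = [40.0 - 3.48 * x for x in xs]
    slope, intercept, r2 = fit_standard_curve(xs, cts)
    print(f"slope = {slope:.2f}, R2 = {r2:.3f}")
    print(f"efficiency = {100 * amplification_efficiency(slope):.1f}%")
```

With the reported mean slope of -3.48 this relation gives an efficiency of roughly 94%, in line with the value stated above; a slope near -3.32 would correspond to 100%.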


RT-qPCR applied to clinical samples. Out of 52 flocks 
with clinical signs suggestive of PEC/PEMS, 19 flocks 
(36.5%) were positive, whereas only five out of 29 
(17.2%) were positive among flocks without enteritis or 
with an enteritis not suggestive of PEC/PEMS (P < 0.07, 
chi-squared test). Positive RT-qPCR results were obtained 
predominantly from the content of the jejunum 
(41.3%) and caecum (39.1%) and in the bursa of 
Fabricius (38.6%). RT-qPCR did not reveal any positives 
out of 23 clinical samples that had been collected during 
previous surveys of turkey enteric disorders performed 
before 1988 (Andral et al., 1985). 
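
The comparison of positive rates between the two groups of flocks can be reproduced with a chi-squared test on the 2 x 2 table of counts given above. The sketch below assumes the uncorrected Pearson statistic with one degree of freedom (the source does not say whether a continuity correction was applied) and uses only the standard library; the function name is hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Rows: flocks suggestive of PEC/PEMS (19 positive, 33 negative) and
    # other flocks (5 positive, 24 negative), as reported in the text.
    chi2, p = chi2_2x2(19, 33, 5, 24)
    print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```

Under these assumptions the P value falls just below 0.07, consistent with the figure reported above.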


Genetic analysis of the CoV strains based on the complete 
spike sequences. The S ORF, as amplified from the 
genomes of the French TCoV (Fr TCoV) FR070341j, 
FR080147c and FR080183j viruses, was 3597 nucleotides 
long (including the stop codon). This size was 
intermediate between that of TCoV (3612 to 3681 
nucleotides) and IBV (3489 to 3510 nucleotides) S genes. 
A typical transcription-regulating sequence motif 
(Zuniga et al., 2004), CUGAACAA, was identified 52 
nucleotides upstream of the S ORF initiation codon, as 
in all other IBV and TCoV genes sequenced to date. The 
French S sequences shared 98% nucleotide identity, 
whereas they were at most 65% identical with 
TCoV-ATCC, one of their closest NA TCoV relatives, and 
merely 50% identical with IBV-California99 (also 
originating from the USA), one of their closest IBV relatives, 
both as detected with the BLAST program. The 
sequence differences were most obvious in the region 
encoding the S1 subdomain, where the French sequences 
shared 97% nucleotide identity, but at most only 56% 
and 38% nucleotide identity with NA TCoV and IBV 
strains, respectively. 
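
The relation between the ORF size given above and the length of the encoded protein is simple arithmetic, sketched here for clarity. The helper name is hypothetical, and the calculation assumes that the quoted ORF length includes the three-nucleotide stop codon, as stated for the French sequences.

```python
def protein_length(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_nt nucleotides."""
    coding_nt = orf_nt - 3 if includes_stop else orf_nt
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return coding_nt // 3


if __name__ == "__main__":
    # 3597 nt including the stop codon, as reported for the Fr TCoV S ORF.
    print(protein_length(3597))
```

For the 3597-nucleotide S ORF this gives 1198 codons, which matches the 1198-amino-acid S protein deduced for the three French strains.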

The consensus tree resulting from the phylogenetic 
analysis of the nucleotide sequences encoding the S1 
subunit of Gammacoronaviruses is shown in Figure 1a. 
The tree corresponding to the full S gene had a similar 
topology with significant bootstrap support (data not 
shown). A first significant cluster (99% bootstrap value) 
grouped all the selected sequences from IBV and TCoV 
and was consistent with isolates belonging to the AvCoV 
species. The SW1 virus from a beluga whale made up a 
second lineage, which was consistent with this virus 
representing a separate species. The three viruses, 
ThCoV, BuCoV, and MuCoV detected in thrush, bulbul, 
and munia, respectively, clustered with the Asian leopard 
cat virus in a third significant cluster (100% bootstrap 
value). Within AvCoVs, two major genetic lineages were 
supported by a 100% bootstrap value: the first grouped 
all selected IBV, whereas the second grouped all NA 
TCoV sequenced to date, the sequence from a quail CoV 
(QCoV) recently identified in Italy (Circella et al., 2007) 
and the FR070341j, FR080147c and FR080183j 
sequences. However, in agreement with the percentage 
nucleotide identity discussed above, the three French 
sequences clustered apart in a significant 
sublineage (100% bootstrap value) (Figure 1a). 

Molecular characterization of French turkey coronavirus 

Figure 1. Consensus phylogenetic tree resulting from the analysis of the nucleotide sequences encoding the S1 subunit of the spike 
glycoprotein (1a), of a 795 bp fragment of the N gene (1b), or of a 615 bp fragment of the RdRp gene (1c) in Fr TCoV strains (black 
dots), the NA TCoV (white dots) and other representative Gammacoronaviruses (accession number in parentheses). The trees were 
constructed using the neighbour-joining method and Kimura two-parameter model. Bootstrap values and the resulting consensus tree were 
calculated from 1000 trees. Only bootstrap values > 75% are indicated. The geographical origin and literature reference of the sequences 
are indicated in Table 3. 

The deduced putative S protein of FR070341j, 
FR080147c and FR080183j was 1198 amino acids 
long. The pairwise comparisons of these amino acid 
sequences revealed 98% amino acid identity between the 
three French strains, which shared at best 61% overall 
identity with the NA TCoV and 37% amino acid identity 
with the IBV strains used in this study. The S1 
subdomain (amino acids 1 to 529) of the three French 
strains shared 96% amino acid identity, whereas 
maximum amino acid identity was only 42% and 18% 
with NA TCoV and with the closest IBV relatives, 
respectively. The alignment of the S1 subdomains of 
the French sequences with that of NA TCoV revealed 
two zones with different levels of amino acid identity 
(Figure 2). Indeed, positions 1 to 196 (S1a) exhibited 
only 35 (18%) fully conserved residues shared by all 
strains from turkeys or quail, whereas amino acid 
positions 197 to 529 (S1b) contained 144 (43%) such 
residues. It is tempting to speculate that high amino acid 
divergence in the S1a region supports an antigenically 
significant hypervariable region (HVR) similar to the 
HVR1 and HVR2 regions described in IBV S1 
(Cavanagh et al., 1988). However, further molecular and 
antigenic studies on TCoV will be required to 
substantiate this hypothesis. At the carboxy terminal end of the 
S1 subdomain, the three French strains shared the same 
putative glycoprotein furin cleavage recognition site, 
TRSRR/S (amino acid positions 525 to 529), which 
appeared to be unique among IBV and TCoV strains. 
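
A simple pattern scan illustrates how a candidate cleavage motif such as TRSRR can be located in a protein sequence. This is only a sketch: the authors used the ProP server, whose prediction method is not reproduced here, and the XRXRR pattern is taken from the motif named in the Figure 2 legend. The function name and the toy sequence are hypothetical.

```python
import re

# X-R-X-R-R, where X is any residue; the lookahead reports overlapping hits.
CLEAVAGE_MOTIF = re.compile(r"(?=(.R.RR))")


def find_cleavage_motifs(protein_seq):
    """Return (start, end, motif) tuples with 1-based, inclusive positions."""
    seq = protein_seq.upper()
    hits = []
    for match in CLEAVAGE_MOTIF.finditer(seq):
        start = match.start() + 1
        motif = match.group(1)
        hits.append((start, start + len(motif) - 1, motif))
    return hits


if __name__ == "__main__":
    toy = "AAAATRSRRSTTT"  # hypothetical fragment, not a real S sequence
    print(find_cleavage_motifs(toy))
```

Applied to a full-length S protein, the reported positions would be compared with the 525 to 529 interval given above; a pattern match alone does not establish that cleavage actually occurs.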

The S2 subdomain, carboxyterminal to the cleavage 
site, proved more conserved (99%, 76% and 52% 
maximum amino acid identity among French sequences, 
between Fr and NA TCoV and between Fr TCoV and 
IBV sequences, respectively). Its predicted hydrophobi- 
city profile was consistent with three extracellular, 
transmembrane and intracytoplasmic domains spanning 
amino acid positions 530 to 1120, 1121 to 1159, and 
1160 to 1198, respectively. As recently reported with S2 
of other CoVs (Yamada & Liu, 2009), the extracellular 
domain of the newly determined TCoV S2 sequences 
exhibited the consensus SAIEDLLF amino acid stretch 
(amino acid positions 694 to 701), located immediately 
carboxy terminal to the SGKPQGR sequence. This 
sequence did not correspond to any furin cleavage site, 
unlike that observed in several IBV isolates; however, it 
was identified as the second most probable protein 
cleavage site (after the S1/S2 cleavage site) in the TCoV 
S protein, and it further fits the XXXR/S model recently 
proposed for a second cleavage of the S protein, which 
appears critical for CoV-induced cell-to-cell fusion in 
cultured cells (Yamada & Liu, 2009). The search of the 
S2 subdomain for heptad repeats (HRs), possibly 
indicative of a coiled-coil tertiary structure, revealed 
two compatible areas spanning amino acid positions 792 
to 948 and 1056 to 1141, respectively. These positions 
were consistent in the alignment of the S proteins with 
the regions previously proposed as encompassing HR1 
and HR2 in IBV (Bosch et al., 2004). Interestingly, the 
alignment demonstrated that all TCoV sequences 
exhibited an exact insertion of two HRs (14 amino acids) in 
both HR1 and HR2. These insertions occurred in close 
vicinity (only four amino acids apart) to the sites 
where similar insertions had been previously recognized 
(although with different inserted amino acids) as 
characteristic of HR1 and HR2 in all Alphacoronaviruses 
(Bosch et al., 2004). The second predicted HR2 in the 
TCoV extracellular domain involved the most amino 
terminal amino acid of the transmembrane domain, 
which also belonged to the YIKWPWYVWL amino 
acid stretch that is highly conserved among the three 
CoV genera (Rota et al., 2003). Finally, the predicted 
S intracellular domain was also highly conserved be- 
tween TCoV and IBV and included the YYTTF signal 
involved in the retention of IBV S at a late Golgi 
compartment (Winter et al., 2008). 


Genetic analysis of the CoV strains based on the 
nucleocapsid and polymerase sequences. For each analysed 
gene, all methods used for phylogenetic analysis 
resulted in trees with a consistent topology (Figure 1b,c). 
Irrespective of the analysed gene, the SW1 CoV and the 
group made of ThCoV, BuCoV, MuCoV and the Asian 
leopard cat CoV always represented significant separated 
genetic lineages, whereas IBV and TCoV strains always 
grouped significantly together within a cluster consistent 
with the AvCoV species. 

Figure 2. Amino acid alignment of the S1 subunit of TCoV-FR070341j in comparison with that of the two other Fr TCoV, QCoV and 
several NA TCoV isolates. Black shading represents conserved amino acids among all TCoV strains. Dots indicate amino acid residues 
that are identical with the FR070341j sequence. The highly variable S1a region is underlined, and the putative cleavage site XRXRR 
stretch is boxed. 


In the N gene, sequences FR070341j, FR080147c and 
FR080183j shared 94 to 97% nucleotide identity. They 
shared 95 to 99% and 91 to 92% nucleotide identity with 
their closest IBV (strain 4/91) and TCoV (strain IN-517/94) 
relatives (as identified with the Blast programme), 
respectively. Within AvCoV, and as already observed 
with the S1 gene, distinct sublineages were apparent, 
containing either the NA or the Fr TCoV sequences, as 
supported by 84% and 100% bootstrap values, respectively 
(Figure 1b). However, unlike what was previously observed 
with the S1 gene, the IBV sequences did not cluster in a 
specific sublineage. Indeed, different IBV strains with 
close phylogenetic relationships with TCoVs were apparent, 
either in the significant sublineage containing all 
NA TCoVs (IBV strains CK/CH/LSD/051, ArkDPI101 
and California99, 84% bootstrap value), or in the 
sublineage containing the three newly determined sequences 
(IBV-4/91, 100% bootstrap value with TCoV-FR070341j). 
Additional IBV strains were also present in 
the AvCoV lineage, but without any significant connection 
to the two above-mentioned sublineages (IBV 
strains M41, ITA/90254/2005 and BJ). 
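
As a rough illustration of how pairwise identity percentages such as those quoted above can be derived from aligned sequences, the following Python sketch counts matching alignment columns. It is not the procedure used in this study: the function name `percent_identity`, the toy sequences, and the choice to skip any column containing a gap are assumptions made here for illustration only.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same alignment (equal length) and
    gaps are written as '-'. Gapped columns are ignored rather than penalised.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real TCoV or IBV sequences).
toy_a = "ATGGCAAGCG"
toy_b = "ATGGCTAGCG"
print(percent_identity(toy_a, toy_b))  # 9 of 10 columns match -> 90.0
```

Other conventions exist, for example counting gapped columns as mismatches, and they would give lower values for regions that need several gaps to maintain alignment.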

In the RdRp gene, sequences FR070341j, FR080147c 
and FR080183j shared 95 to 98% nucleotide identity, 
and at most 95 and 94% nucleotide identity with their 
closest IBV (strain ITA/90254/2005) and TCoV (strain 
IN-517/94) relatives, respectively. Within the AvCoV 
lineage, the three French strains clustered significantly 
together again, in a sublineage that also contained IBV 
strain ITA/90254/2005 (90% bootstrap value in Figure 
1c). As already observed with the S1 or N genes, a 
significant cluster (92% bootstrap value) grouped some 
NA TCoV (TX-1038/98, VA-74/03, MG10) with IBV 
strains (California99 and ArkDPI101). However, unlike 
what was observed previously, the bootstrap values did not 
support the clustering of all NA TCoV sequences into it, as 
IN-517/94 and TCoV-ATCC were only loosely related 
(bootstrap values < 75%). In addition, the phylogenetic 
analysis suggested that IBV CK/CH/LSD/051 branched 
significantly apart from all other studied AvCoV RdRp 
sequences (95% bootstrap value), whereas this IBV 
strain had previously exhibited an N sequence genetically 
related to NA TCoVs (see above and Figure 1b). 


Discussion 


The aim of the present study was to further investigate 
the possible role of TCoV in digestive disorders of 
commercial turkeys in France and to refine the mole- 
cular characterization of the detected viruses. 

As a first step, we therefore decided to develop an 
RT-qPCR assay specific for the most economically significant 
AvCoV (i.e. IBV and TCoV). We selected the N 
gene to design primers and probe, as it is present in all 
genomic and subgenomic CoV RNA transcribed in 
infected cells, thus ensuring sensitivity of detection, 
and also contains highly conserved regions allowing 
one to avoid false negatives due to genetic variation. 
Indeed, it has been reported for different avian RNA 
viruses that the genetic lineages that circulate in the USA 
and Europe may exhibit a very significant degree of 
genetic variation (Webster et al., 1992; Toquin et al., 
2006). The authors are aware of only one other published 
broad-spectrum RT-PCR assay suitable for the detection 
of both IBV and TCoV (Callison et al., 2006). That assay, 
however, targeted the 3’-untranslated region of AvCoV 
rather than the N gene targeted here. 

In France, the most studied AvCoV has so far been 
IBV, with outbreaks of infectious bronchitis in the late 
1980s leading to the first detection of the CR88 genotype 
(also named 793/B, or 4/91) (Cavanagh et al., 1998) that 
replaced the previously prevalent D274 and Massachusetts 
serotypes (Davelaar et al., 1984; Picault et al., 
1986). More recently, two new genotypes, QX and 
Italy02, have emerged in Europe and in France 
(Worthington et al., 2008). There has so far been no 
report in France of any turkey infection by AvCoV. In 
Great Britain, however, Cavanagh et al. (2001) reported 
the presence of a coronavirus in turkeys with enteritis. 

Clinical data from the investigated flocks suggest a 
trend for this virus to be more frequently detected 
(36.5% flock prevalence) in flocks with digestive disorders 
(P < 0.07, chi-squared test). In North America, 
PEMS has often been associated with the presence of 
TCoV (Brown et al., 1997; Guy et al., 1997; Loa et al., 
2000). The virus was detected in the epithelia of the 
intestinal tract and bursa of Fabricius (Nagaraja & 
Pomeroy, 1997). These anatomical locations correlate 
well with the samples identified in the present study as 
most contaminated by Fr TCoV (jejunum, caecum and 
bursa of Fabricius, which thus appear as the samples to 
be preferred for the molecular diagnosis of Fr TCoV). 
Many viruses other than TCoV have also been reported 
to be associated with enteritis in turkeys. These include, 
both in the Americas and in Europe, type 2 turkey 
astroviruses, picorna-like viruses, reoviruses, rotaviruses, 
and adenoviruses (Andral et al., 1985; Reynolds & Saif, 
1986; Gough & Drury, 1998; Guy, 1998; Koci et al., 
2000; Cattoli et al., 2007; Da Silva et al., 2008; Pantin- 
Jackwood et al., 2008a; Woolcock & Shivaprasad, 2008; 
Jindal et al., 2009b). However, experimental studies with 
these agents inoculated alone (Schultz-Cherry et al., 
2000; Heggen-Peay et al., 2002; Pantin-Jackwood et al., 
2008b; Gomaa et al., 2009) or in combinations (Yu 
et al., 2000; Jindal et al., 2009a; Spackman et al., 2010) 
seldom reproduce the whole range of signs observed in 
the diseased flocks. Determining whether the newly 
detected Fr TCoV plays a significant role in the 
increased incidence of digestive disorders reported in 
French turkey flocks since 2003 (Germain & Rousseau, 
2005) will require isolation of the virus and further in 
vivo experimental studies. Conversely, whether the IBV 
isolates that appear phylogenetically related with Fr 
TCoV could possibly infect turkeys might deserve 
further experimental testing. 
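
The chi-squared comparison mentioned above can be sketched for a 2 x 2 table (flocks with or without digestive disorders, by coronavirus-positive or coronavirus-negative result). The per-flock counts are not given in this section, so the counts used below are purely hypothetical placeholders. The function name `chi_squared_2x2` is likewise an assumption, and no continuity correction is applied, which may differ from the test actually performed.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-squared test (1 degree of freedom, no continuity correction).

    Table layout (assumed for this sketch):
                        virus detected   virus not detected
    digestive disorders        a                 b
    no digestive disorders     c                 d
    Returns (chi-squared statistic, two-sided P value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, NOT the data of this study.
chi2, p = chi_squared_2x2(20, 10, 10, 20)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

With small expected counts, Fisher's exact test or a continuity correction would usually be preferred over the uncorrected statistic shown here.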

In order to evaluate precisely the genetic relationships 
between the newly detected sequences and other avian 
CoVs, the entire S protein gene was sequenced. To the 
authors’ knowledge these are the first full-length se- 
quences of the S gene in a European TCoV. The three 
newly determined S sequences were highly homogeneous 
(98% nucleotide and amino acid identity) and the BLAST 
search revealed highest similarity with the published 
sequences of the S gene of NA TCoVs (approximately 
65% and 60% nucleotide and amino acid identity, 
respectively, E value = 0). Consistently, these French 
sequences clustered significantly with previously published 
NA TCoV sequences in all phylogenetic analyses 
(100% bootstrap value in Figure 1a and data not shown). 
This confirmed that the newly detected viruses can indeed 
be considered bona fide Fr TCoV strains. 

Although significantly related to the previously pub- 
lished NA TCoV sequences, the Fr TCoV sequences 
exhibited striking original features. Indeed, the genetic 
region encoding the hypervariable S1 subunit proved 
very different from all sequences previously published, as 
NA TCoVs shared at least 86% amino acid identity with one 
another, but at best 42% amino acid identity with Fr TCoVs. Fr 
TCoV sequences also appeared strikingly different from 
the recently released partial sequence of QCoV from 
Italy (Circella et al., 2007). As shown in Figure 2, genetic 
divergence between Fr TCoV and previously released 
sequences was especially apparent between amino acid 
positions 1 to 196 with only 18% amino acid identity 
with NA TCoVs or QCoV and several gaps necessary to 
maintain alignment. Such a region with low amino acid 
conservation is strongly suggestive of marked antigenic 
variation between NA and French viruses, as observed 
in IBV HVRs (Cavanagh et al., 1988). Interestingly, a 
lack of cross neutralization in spite of high amino acid 
identity (96 to 98%) in the S gene has been reported 
between NA TCoV strains VA-74/03, TX-1038/98, and 
IN-517/94 (Jackwood et al., 2010). Such a result further 
supports the hypothesis of large antigenic variation 
among NA and French turkey viruses, as these two 
groups of viruses share a low amino acid identity, but it 
also indicates that cross-neutralization studies between 
French viruses might be interesting in spite of the fact 
that the three Fr TCoV sequences share 96% amino acid 
identity in their S1 region. The originality of the Fr 
TCoV sequences was confirmed by all phylogenetic 
analyses performed with the S, S1 and even S2 sequences, 
as Fr TCoVs always grouped as a very significant 
subcluster (100% bootstrap value in Figure 1a and 
data not shown) apart from NA TCoVs, in contrast 
with the previously identified monophyletic TCoV group 
(Jackwood et al., 2010). Although based on a limited 
number of S sequences from Fr TCoV, these results 
suggest that different lineages have evolved in North 
America and Europe. Such a genetic drift had already 
been suspected by comparing partial sequences encom- 
passing several genes at the 3’ end of the genome of NA 
TCoVs with the homologous sequences detected in 
turkeys in the UK (Loa et al., 2006). Only 247 bp, 
spanning from the 3’ end of S gene to the 3’ end of 3a 
gene, overlap between the previously released UK 
sequences (Cavanagh et al., 2001) and those determined 
for the present study. The high identity scores between 
the UK sequence and Fr TCoVs (96 to 99%) suggest 
both viruses could belong to the same group. 

Possible recombination with IBV has been suggested 
recently as an evolutionary mechanism important in the 
molecular epidemiology of NA TCoVs (Jackwood et al., 
2010). To test this hypothesis in Fr TCoVs, we sequenced 
fragments of genes upstream (RdRp) and downstream 
(N) of the S gene. Their phylogenetic analysis further 
supported the view that Fr TCoVs represent a genetic lineage 
significantly different from NA TCoVs (Figure 1b,c). 
However, in contrast with the phylogenetic trees derived 
from the S sequences, which all grouped NA and Fr 
TCoVs in significant clusters formed of turkey viruses 
only, the trees derived from the N or RdRp sequences 
always included some IBV strains clustering within the 
NA or Fr TCoV lineages. The IBV strains most 
phylogenetically related to either NA TCoVs or Fr 
TCoVs were different, thus indicating that whereas 
recombination events could have occurred, they might 
have involved different IBV partners in the American or 
European TCoV lineages. Interestingly, the IBVs most 
related with the N and RdRp genes from Fr TCoVs 
appeared to be 4/91 and QX (ITA/90254/2005), respec- 
tively. These strains represented the last two emerged 
IBV serotypes in Europe (Worthington et al., 2008). It is 
not known whether this observation is consistent with 
the lack of detection of Fr TCoV in samples collected 
before 1988 and indicates a possible recent recombina- 
tion event involving the recent IBV serotypes. 
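
The reasoning above, that the closest relative of Fr TCoV changes from one gene to the next, can be made explicit with a small Python sketch. The identity values are the upper bounds of the nucleotide identity figures quoted in this article for Fr TCoV; the dictionary layout and the function names `closest_relative_per_gene` and `is_discordant` are assumptions introduced here for illustration. Such discordance is only compatible with recombination and does not demonstrate it, and identity scores are a cruder criterion than the bootstrap-supported trees.

```python
# Upper bounds (%) of the nucleotide identity figures quoted in the text
# for Fr TCoV versus its reported relatives. Only the comparisons reported
# in the text are listed; this is not a complete identity matrix.
IDENTITY = {
    "S": {"NA TCoV": 65, "IBV": 50},
    "N": {"IBV 4/91": 99, "TCoV IN-517/94": 92},
    "RdRp": {"IBV ITA/90254/2005": 95, "TCoV IN-517/94": 94},
}


def closest_relative_per_gene(identity: dict[str, dict[str, float]]) -> dict[str, str]:
    """Return, for each gene, the relative with the highest identity score."""
    return {gene: max(scores, key=scores.get) for gene, scores in identity.items()}


def is_discordant(closest: dict[str, str]) -> bool:
    """True when the closest relative is not the same for every gene."""
    return len(set(closest.values())) > 1


closest = closest_relative_per_gene(IDENTITY)
for gene, relative in closest.items():
    print(f"{gene}: closest relative is {relative}")
print("discordant across genes:", is_discordant(closest))
```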